Reducing Labeling Effort for Structured Prediction Tasks
Abstract
A common obstacle to the rapid deployment of supervised machine learning algorithms is the lack of labeled training data. Such data is particularly expensive to obtain for structured prediction tasks, where each training instance may have multiple, interacting labels, all of which must be correctly annotated for the instance to be of use to the learner. Traditional active learning addresses this problem by optimizing the order in which the examples are labeled to increase learning efficiency. However, this approach does not consider the difficulty of labeling each example, which can vary widely in structured prediction tasks. For example, the labeling predicted by a partially trained system may be easier to correct for some instances than for others. We propose a new active learning paradigm which reduces not only how many instances the annotator must label, but also how difficult each instance is to annotate. The system leverages information from partially correct predictions to efficiently solicit annotations from the user. We validate this active learning framework in an interactive information extraction system, reducing the total number of annotation actions by 22%.
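The idea of weighing how informative an instance is against how hard it is to annotate can be illustrated with a minimal sketch. This is not the method from the paper, whose details the abstract does not give. The scoring rule, the cost model (one confirming action plus the expected number of label corrections) and the names `informativeness`, `expected_cost` and `select_next` are all assumptions made for illustration.

```python
from math import prod


def informativeness(confidences):
    """Assumed measure: probability that the predicted labeling has at least one error."""
    return 1.0 - prod(confidences)


def expected_cost(confidences):
    """Assumed cost model: one action to confirm, plus one per expected label correction."""
    return 1.0 + sum(1.0 - p for p in confidences)


def select_next(pool):
    """Pick the instance with the highest informativeness per expected annotation action."""
    return max(
        pool,
        key=lambda name: informativeness(pool[name]) / expected_cost(pool[name]),
    )


# Hypothetical per-label confidences from a partially trained model.
pool = {
    "a": [0.9, 0.9, 0.9],
    "b": [0.5, 0.5, 0.5, 0.5],
    "c": [0.99, 0.4],
}

most_uncertain = max(pool, key=lambda name: informativeness(pool[name]))
best_per_action = select_next(pool)
print(most_uncertain, best_per_action)
```

With these made-up numbers, ranking by uncertainty alone favors instance "b", whose labels are all doubtful, while the cost-aware ranking favors "c", which is nearly as informative but is expected to need fewer corrections.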
Related papers
Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints
Language is increasingly being used to define rich visual recognition problems with supporting image collections sourced from the web. Structured prediction models are used in these tasks to take advantage of correlations between co-occurring labels and visual input but risk inadvertently encoding social biases found in web corpora. In this work, we study data and models associated with multila...
Confidence Estimation in Structured Prediction (Irwin and Joan Jacobs Center for Communication and Information Technologies)
Structured classification tasks such as sequence labeling and dependency parsing have seen much interest from the Natural Language Processing and machine learning communities. Several online learning algorithms have been adapted for structured tasks, such as Perceptron, Passive-Aggressive and the recently introduced Confidence-Weighted learning. These online algorithms are easy to implement, fast t...
Hands-on Learning to Search for Structured Prediction
Many problems in natural language processing involve building outputs that are structured. The predominant approach to structured prediction is “global models” (such as conditional random fields), which have the advantage of clean underlying semantics at the cost of computational burdens and extreme difficulty in implementation. An alternative strategy is the “learning to search” (L2S) paradigm...
Structured Prediction via Learning to Search under Bandit Feedback
We present an algorithm for structured prediction under online bandit feedback. The learner repeatedly predicts a sequence of actions, generating a structured output. It then observes feedback for that output and no others. We consider two cases: a pure bandit setting in which it only observes a loss, and more fine-grained feedback in which it observes a loss for every action. We find that the ...
Deep Learning in Lexical Analysis and Parsing
Lexical analysis and parsing tasks, which model deeper properties of words and their relationships to each other, typically involve word segmentation, part-of-speech tagging and parsing. A typical characteristic of such tasks is that the outputs are structured. All of them fall into three types of structured prediction problems: sequence segmentation, sequence labeling and parsing. In this...